
\documentclass[11pt]{article} % use larger type; default would be 10pt
\usepackage{framed}
\usepackage[utf8]{inputenc} % set input encoding (not needed with XeLaTeX)
\usepackage{geometry} % to change the page dimensions
\geometry{a4paper} % or letterpaper (US) or a5paper or....
% \geometry{margin=2in} % for example, change the margins to 2 inches all round
% \geometry{landscape} % set up the page for landscape
%   read geometry.pdf for detailed page layout information

\usepackage{graphicx} % support the \includegraphics command and options

% \usepackage[parfill]{parskip} % Activate to begin paragraphs with an empty line rather than an indent

%%% PACKAGES
\usepackage{booktabs} % for much better looking tables
\usepackage{array} % for better arrays (eg matrices) in maths
\usepackage{paralist} % very flexible & customisable lists (eg. enumerate/itemize, etc.)
\usepackage{verbatim} % adds environment for commenting out blocks of text & for better verbatim
\usepackage{subfig} % make it possible to include more than one captioned figure/table in a single float
% These packages are all incorporated in the memoir class to one degree or another...

%%% HEADERS & FOOTERS
\usepackage{fancyhdr} % This should be set AFTER setting up the page geometry
\pagestyle{fancy} % options: empty , plain , fancy
\renewcommand{\headrulewidth}{0pt} % customise the layout...
\lhead{}\chead{}\rhead{}
\lfoot{}\cfoot{\thepage}\rfoot{}

%%% SECTION TITLE APPEARANCE
\usepackage{sectsty}
\allsectionsfont{\sffamily\mdseries\upshape} % (See the fntguide.pdf for font help)
% (This matches ConTeXt defaults)

%%% ToC (table of contents) APPEARANCE
\usepackage[nottoc,notlof,notlot]{tocbibind} % Put the bibliography in the ToC
\usepackage[titles,subfigure]{tocloft} % Alter the style of the Table of Contents
\renewcommand{\cftsecfont}{\rmfamily\mdseries\upshape}
\renewcommand{\cftsecpagefont}{\rmfamily\mdseries\upshape} % No bold!
\begin{document}


\section{Machine Learning Application: Photo OCR}


\subsection*{Question  1.}
Suppose you are running a sliding window detector to find text in images. Your input images are $1000 \times 1000$ pixels. You will run your sliding window detector at two scales, $10 \times 10$ and $20 \times 20$ (i.e., you will run your classifier on many $10 \times 10$ patches to decide whether they contain text, and also on many $20 \times 20$ patches), and you will ``step'' your detector by 2 pixels each time. About how many times will you end up running your classifier on a single $1000 \times 1000$ test set image?

CORRECT 500,000

250,000

1,000,000

100,000
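\par\medskip
\textit{Worked estimate (a sketch, assuming border effects are ignored so that each axis has about $1000/2 = 500$ window positions at either scale):}
\[
  \underbrace{500 \times 500}_{10 \times 10 \mbox{ windows}} + \underbrace{500 \times 500}_{20 \times 20 \mbox{ windows}} = 500{,}000 .
\]
\textit{Counting only the windows that fit entirely inside the image gives a slightly smaller figure,}
\[
  \left( \left\lfloor \frac{1000 - 10}{2} \right\rfloor + 1 \right)^2 + \left( \left\lfloor \frac{1000 - 20}{2} \right\rfloor + 1 \right)^2 = 496^2 + 491^2 = 487{,}097 ,
\]
\textit{which still rounds to about 500,000.}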

%===================================================================%
\subsection*{Question  2. }
Suppose that you just joined a product team that has been developing a machine learning application using $m = 1{,}000$ training examples. You discover that you have the option of hiring additional personnel to help collect and label data. You estimate that you would have to pay each of the labellers \$10 per hour, and that each labeller can label 4 examples per minute. About how much will it cost to hire labellers to label 10,000 new training examples?

\$10,000

\$250

\$600

CORRECT \$400
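\par\medskip
\textit{Worked estimate (a sketch using only the rates given in the question):}
\[
  \frac{10{,}000 \mbox{ examples}}{4 \mbox{ examples per minute}} = 2{,}500 \mbox{ minutes} \approx 41.7 \mbox{ hours},
\]
\[
  41.7 \mbox{ hours} \times \mbox{\$10 per hour} \approx \mbox{\$417} ,
\]
\textit{which is closest to the \$400 option.}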

%===================================================================%
\subsection*{Question  3. }
What are the benefits of performing a ceiling analysis? Check all that apply.

CORRECT It can help indicate that certain components of a system might not be worth a significant amount of work improving, because even if it had perfect performance its impact on the overall system may be small.

If we have a low-performing component, the ceiling analysis can tell us if that component has a high bias problem or a high variance problem.

CORRECT It helps us decide on allocation of resources in terms of which component in a machine learning pipeline to spend more effort on.

It is a way of providing additional training data to the algorithm.

%===================================================================%
\subsection*{Question  4. }
Suppose you are building an object classifier that takes an image as input and recognizes that image as either containing a car ($y=1$) or not ($y=0$). For example, here are a positive example and a negative example:


After carefully analyzing the performance of your algorithm, you conclude that you need more positive ($y=1$) training examples. Which of the following might be a good way to get additional positive examples?


CORRECT Mirror your training images across the vertical axis (so that a left-facing car now becomes a right-facing one).

Take a few images from your training set, and add random Gaussian noise to every pixel.

Take a training example and set a random subset of its pixels to 0 to generate a new example.

Select two car images and average them to make a third example.
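\par\medskip
\textit{A sketch of the mirroring option in symbols (the names $X$, $X'$, $H$ and $W$ are illustrative and do not come from the question): for a training image $X$ with $H$ rows and $W$ columns of pixels, the mirrored image $X'$ is}
\[
  X'_{i,j} = X_{i,\,W+1-j}, \qquad 1 \le i \le H, \quad 1 \le j \le W ,
\]
\textit{and the label is kept, so a positive example $(X,\, y=1)$ yields the additional positive example $(X',\, y=1)$.}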

%===================================================================%
\subsection*{Question  5. }
Suppose you have a Photo OCR system, where you have the following pipeline:


You have decided to perform a ceiling analysis on this system, and find the following:


Which of the following statements are true?


SELECTED There is a large gain in performance possible in improving the character recognition system.

Performing the ceiling analysis shown here requires that we have ground-truth labels for the text detection, character segmentation and the character recognition systems.

WRONG The least promising component to work on is the character recognition system, since it is already obtaining 100\% accuracy.
\textit{The character recognition component is the most promising: supplying ground-truth character recognition improves performance by 18\% over feeding the current character recognition system ground-truth character segmentation.}

SELECTED The most promising component to work on is the text detection system, since it has the lowest performance (72\%) and thus the biggest potential gain.
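\par\medskip
\textit{Worked comparison (a sketch; it assumes the 72\% figure is the overall accuracy once text detection is replaced by ground truth, and it infers the character segmentation figure from the 18\% gain stated above rather than from the table itself):}
\[
  \mbox{character segmentation: } 100\% - 18\% = 82\%, \qquad \mbox{gain } = 82\% - 72\% = 10\% ,
\]
\[
  \mbox{character recognition: } 100\%, \qquad \mbox{gain } = 100\% - 82\% = 18\% .
\]
\textit{Under these assumptions, the larger gain comes from perfecting character recognition, which is why a low figure in an earlier row does not by itself make that component the most promising.}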

\end{document}


